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O , low-volatility economic growth; and (ii) chain-like, associated with high- volatility 

\ economic crisis. Finally, we present statistical evidence, based on the emergence 

. ^ I of a star-like MST in Sep 2009, and the MST staying robustly star-like throughout 

^ ■ the Greek Debt Crisis, that the US economy is on track to a recovery. 
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1. Introduction 



In the CBS '60 Minutes' interview televised on 15 March 2009, Ben Bemanke 
predicted that the recession triggered by the global financial crisis will end in 
2009, and the US economy will recover in 2010 |[l]]. While we will never know 
whether Bemanke made the prediction based on his gut feelings, or on simulation 
results from some sophisticated macroeconomic model, what we do know is that 
the prediction sparked intense public debate on whether the Chairman of the US 
Federal Reserve was overly optimistic. Given that the financial industry was still 
reeling from the massive October 2008 slide, reactions to Bernanke's statement 
must be especially strong. We also know that the US Federal Reserve does not 
appear to be behind its Chairman: up till September 2010, the interest rate has not 
been raised [0], even though there has been calls from within the Federal Reserve 
system to tighten the money supply JH]. This has led to mounting concerns from 
economists that the oversupply of government money, in the form of an interest 
rate that is nearly zero, will cause an inflation when the economy recovers [H,!!]. 
In fact, a commentator argued that US stimulus money is fueling property bubbles 
all over Asia, and warned that the global economy will crash once again in 2012 
when the Feds rein in their easy money [@]. 

In January 2004, there was a similar call by economists to raise interest rates 
dv]], when the US economy was showing signs of coming out of the technology 
bubble crisis. The Federal Reserve responded hesitantly only in June 2004 [2]. We 
can understand the concern of the US government then, and possibly also now: 
how do we know that the early signs will lead on to a recovery that will strengthen 
and stay the course? From these historical and contemporary lessons, we know 
that a more sensitive and more robust indicator of economic recovery is needed. 
While much work has been done on developing and validating reliable precursor 
signatures (also called leading indicators) for the onset of financial crises (see 
for example, Refs. |[8-14|), and understanding such economic disasters in general 
(see for example, Refs. lll5l - l20ll ). less has been done to find robust indicators of 
economic recovery (see for example, Refs. ll2ll-l23ll'). In this work, we hope to 
address this important gap. 

Recently, we adapted the recursive entropic segmentation method [1241 12511 de- 
veloped by Bemaola-Galvan and coworkers for biological sequence segmentation, 
and applied it to financial time series segmentation ll26ll . Based on our segmen- 
tation of the Dow Jones Industrial Average (DJIA) time series between 1997 and 
2008, we saw that the US economy, as measured by the DJIA, switched between a 
high- volatility crisis phase and a low-volatility growth phase. The first crisis phase 
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lasted from mid- 1998 to mid-2003, coinciding with the US technology bubble and 
the ensuing economic recession. The second crisis phase started in mid-2007, co- 
inciding with the US Subprime Crisis and the ensuing global financial crisis. More 
interestingly, we could also identify a year-long series of precursor shocks prior 
to the mid- 1998 and mid-2007 onsets of two crisis phases, as well as a year-long 
series of inverted shocks prior to the mid-2003 economic recovery. The series of 
inverted shocks started with the mid-2002 Dow Jones low, so if we believe the in- 
ternal dynamics of the US economy had not changed from the previous financial 
crisis to the present financial crisis, we would naively expect the US economy to 
recover one year after the March 2009 lows, i.e. the second quarter of 2010, give 
and take. 

Clearly, a single study of a single time series spanning only two financial crises 
and one growth period is hardly enough statistical evidence in Bernanke's favor. 
To enhance the statistical significance of features seen in the segmented DJIA time 
series, we carried out a cross-section study, comparing the segmented time series 
of the ten Dow Jones US (DJUS) economic sector indices ll27n . By identifying the 
sequences of onsets in the ten DJUS indices, we find sectors in the US economy 
going first to last into the present financial crisis in merely two months! While we 
may or may not have an extended sequence of precursor shocks to work with for 
predicting market crashes and financial crisis (see the recent update [|28n on the 
heroic efforts by Somette and coworkers), when the dominoes are set in motion 
policy makers will have a month or two to contain the crisis. Since this financial 
crisis eventually spread globally, we will have to wait for the next potential crisis 
to find out if containment is at all possible. We do know, however, that the US 
Federal Reserve acted promptly, announcing the first of a series of interest rate 
cuts in August 2007. Unfortunately, as detailed in Ref. ll27n . we saw these rate 
cuts rapidly losing effectiveness. A critical discussion on the actions taken by the 
US Federal Reserve can be found in Ref. f23\. 

In the same comparative study, we also identified the sequence of economic 
recoveries in the different US economic sectors. The excruciatingly slow com- 
plete economic recovery from the previous financial crisis, defined as consistent 
growth in the first sector to consistent growth in the last sector, took one and a 
half years. Given this long time scale, developing robust indicators to detect eco- 
nomic recovery, and thereafter designing timely stimulus packages, should be eas- 
ier than finding sensitive indicators that would warn us of an impending financial 
crisis. We would imagine that tracking slow month-to-month indicators should be 
enough to give us a confident forecast on the start of growth, but all through the 
second half of 2009 and 2010 to date, we hear commentators mostly urging cau- 
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tion BOMBTII . We believe this cautious outlook can be blamed partly on swings in 
the stock markets, which always become strong when things are taking a turn for 
the better or for the worst. Perhaps the way to allay such market-driven fears is to 
extract convincing signs from the high-frequency stock-market data itself. Based 
on these signs, policy makers can then tell more confidently that the economy will 
recover in a matter of months, and start planning measures to further stimulate the 
recovery. 

In this paper, which is organized into six sections, we report a minimal span- 
ning tree (MST) study of the segmented time series of the ten DJUS economic 
sector indices. In Section [2l we describe the data sets studied, and the statisti- 
cal methods used to analyze them. In Section [3l we examine the gross structure 
of the 10-sector MST over the 2000 to 2009 period, as well as those over the 
2002-2003 crisis period, the 2004-2005 growth period, and the 2008-2009 crisis 
period. We explain the macroeconomic significance of the core-fringe structure of 
the MSTs, and also suggest why the MSTs organize themselves into a star topol- 
ogy during growth, and into a chain topology during crisis. Then, in Section H we 
construct MSTs within segments associated with distinct macroeconomic phases 
to study the correlational dynamics within the US economy. We again find that 
the MST is star-like in low-volatility segments, and chain-like in high- volatility 
segments. This tells us that the star-like MST is a robust and reliable character of 
economic well being. By combining temporal information obtained through sta- 
tistical segmentation and clustering, we show that volatility shocks always start at 
the fringe and propogate inwards. Some of the links to leader sectors have anoma- 
lously high cross-correlations. We also check whether such volatility shocks have 
a more domestic or more global origin. Finally in Section [5l by examining a 
nearly contiguous sequence of corresponding segments, we look at how the MST 
rearranges in the pre-recovery periods for both the previous and the current finan- 
cial crises. We found very violent rearrangements prior to the previous economic 
recovery. For the present financial crisis, we can see clear signatures of star-to- 
chain and chain-to-star rearrangements, accompanied by the expected changes in 
market volatilities and cross-correlations. This suggests that the US market has 
become more efficient, as far as processing information is concerned, over the past 
5-10 years. After predicting that the US economy will recover in early 2010, we 
summarize our findings in Section |6l 
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2. Data and methods 



2.1. Data 

Tic -by-tic data for the ten Dow Jones US (DJUS) economic sector indices (see 
Table [H for the indexing scheme i = 1, . . . , 10 used) over the period 14 February 
2000 to 25 November 2009 were downloaded from the Thomson-Reuters Tick- 
history (formerly known as Taqtic) database 113 811 . These were then processed 
into time series X, = (X, i, . . . . . . ,Xi N) at fixed time intervals indexed by 

I < t < N . Since financial markets are known to exhibit complex dynamics on 
multiple time scales, the data frequency has to be carefully selected. In the fi- 
nancial economics literature, intervals ranging from 5 to 60 minutes have been 
used for estimating realized or benchmark daily volatilities for foreign exchange 
or stock market time series [|39l-l43ll. In general, higher data frequencies are not 
employed due to worries about the effects of market microstructures. 



Table 1: The ten Dow Jones US economic sector indices as defined by the Industry Clas- 
sification Benchmark (ICB). These are float-adjusted market capitalization weighted sums 
of variable numbers of component stocks, introduced on February 14, 2000 to measure the 
performance of US stocks in the ten ICB industries. The makeup of these indices are reviewed 
quarterly, and the number of components and float-adjusted market capitalizations taken from 

^ttp : //www. dj indexes . com/mdsidx/downloads/f act_inf o/Dow_Joiies_US_Iiidexes_Industry_Indexes_Fact_£ 
and are accurate as of November 30, 2010. The top components of each index are shown in 
Appendix Appendix A Although they do not make up any of the top spots, homebuilders and 
developers are components of the consumer goods (NC) sector 



/ 


symbol 


sector 


number of 
component stocks 


float-adjusted market 
capitalization (billion USD) 


1 


BM 


Basic Materials 


155 


506.7 


2 


CY 


Consumer Services 


484 


1,649.1 


3 


EN 


Oil & Gas 


214 


1,405.7 


4 


FN 


Financials 


876 


2,192.5 


5 


HC 


Healthcare 


512 


1,423.8 


6 


IN 


Industrials 


692 


1,725.7 


7 


NC 


Consumer Goods 


326 


1,351.1 


8 


TC 


Technology 


509 


2,158.1 


9 


TL 


Telecommunications 


44 


379.5 


10 


UT 


Utilities 


96 


470.9 



We chose to sample the time series at 30-minute intervals. As explained 
in Ref. ll26ll . the half-hourly data frequency allows us to confidently identify 
statistically stationary segments as short as a day. Higher data frequency was 
not used, because in a macroeconomic study such as this, we are not interested 
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in segments shorter than a day, i.e. the intraday market micro structure. From 
the index time series X,, we then prepare the log-index movement time series 
X,- = (^,,1, . . . , Xjj, Xi^N-i), where x,-, = logX,,,+i - logX,;,. We work with log- 
index movements, because different indices have different magnitudes, and it is 
more meaningful to compare their fractional changes. 

2.2. Segmentation and clustering as discovery tools in statistical physics 

Before we go on to describe in greater details the segmentation and clustering 
methods we used in this study, let us disgress to discuss how segmentation and 
clustering can be useful tools for making discoveries in statistical physics. To be- 
gin, let us consider how the thermodynamic phase diagram of a given system, like 
that shown in Fig.[Tl can be determined. Typically, the thermodynamic phases are 
characterized by order parameters, which are frequently macroscopic quantities 
like density, magnetization, or electric polarization. Order parameters have the 
property that they have different values in different phases. Hence, so long as the 
length and time scales of our system of interest are not too large, we can mea- 
sure all macroscopic quantities experimentally, see which changes sharply as we 
go from one phase to another to identify the macroscopic order parameters. We 
can then perform even more careful measurements on these macroscopic order 
parameters to construct the phase diagram. 

When the length or time scales of our system of interest are too large, for ex- 
ample, in protein folding (small length scale, but long time scale), or a nuclear 
weapon detonating in a city (small time scale, but long length scale), or climate 
change (long length scale and long time scale), it may no longer be possible to 
identify the order parameters empirically. However, it may be comparatively easy 
to obtain high-frequency time series data of any number of microscopic quanti- 
ties. To see how these microscopic time series can be useful towards our eluci- 
dation of the phase diagram, we must first understand that each thermodynamic 
phase is associated with a low-dimensional manifold in the phase space of our 
system. Such low-dimensional manifolds arise because of conservation laws, or 
through interaction-driven self-organization, or both. In this statistical mechan- 
ical picture, order parameters are the thermodynamic coordinates of individual 
low-dimensional manifolds. When the system is in a given phase, microscopic 
variables fluctuate about the associated low-dimensional manifold, which typi- 
cally has slow dynamics. When the system makes a transition into another phase, 
the slow dynamics and the low-dimensional manifold changes, and so does the 
fast fluctuations of the microscopic variables. 
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temperature 



Figure 1 : A typical phase diagram showing where the soHd, liquid, and gas phases of a substance 
occurs in the pressure-temperature (p-T) plane. Also shown in the figure are the equilibrium 
fluctuations 6r in the displacement of a given atom in the (a) solid phase, with time-independent 
variance {{Srf-) T; (b) liquid phase, with a diffusive variance {{6r)~} oc t; and (c) gas phase, with 
long ballistic hfetimes. 



In Fig. [H we illustrate this connection between the character of fast fluctua- 
tions and the underlying slow dynamics, by showing the equilibrium fluctuations 
6r in the displacement of a given atom. In Fig. [IJa), we show the non-diffusive 
equilibrium fluctuations 6r in the solid phase. In this phase, the variance {(6r)^) 
of these fluctuations is proportional to the temperature T, but otherwise time in- 
dependent. In Fig. lHh), we show the dilTusive equilibrium fluctuations 6r in the 
liquid phase. In this phase, the variance {{6r)^) of the fluctuations grows with 
time. Finally, in Fig.HJc), we show the diffusive equilibrium fluctuations 6r in the 
gas phase. Diffusive fluctuations in the gas phase can be distinguished from those 
in the liquid phase by the long ballistic lifetimes in the former. 

Now, suppose different phases are present in the system, like the ice-water 
example shown in Fig. [2l In this example, we can conclude that two phases are 
present, by sorting the atomic displacement time series into two groups, based on 
the fluctuation chacracteristics discussed above. Alternatively, we can cluster the 
microscopic time series, to find the statistically distinct ice and liquid phases. We 
might even be able to detect the convection currents present in the system. More- 
over, if the system undergoes repeated phase transitions, the phases that appear in 
the history of the system can be discovered by first segmenting the microscopic 
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Figure 2: Microscopic time series within a system with mixed phases, in this example, ice and 
water Based on our understanding of the character of microscopic fluctuations, the time series 
can be sorted into an ice cluster and a water cluster Alternatively, these clusters can be discovered 
using statistical clustering methods, without any prior knowledge of the phases and their statistical 
properties. 



time series, and then clustering the time series segments. So long as there is ad- 
equate data, and the microscopic fluctuations in difi'erent phases are sufficiently 
distinct, we can always discover how many such phases are present through sta- 
tistical segmentation and clustering of the microscopic time series. These two 
generally robust procedures are very well suited to the study of complex systems, 
for which we have high frequency time series data, but no idea how many phases 
are present, and what their characteristics are. In addition to the macroeconomic 
study reported in this paper, and related studies on the Japanese economy, the 
Asian and European economies, we are also applying the two methods to under- 
stand protein folding dynamics and earthquake dynamics. 

2.3. Segmentation 

Financial time series are well known to be highly nonstationary. In particular, 
several recent studies revealed that the instantaneous volatility fluctuates about a 
constant level, before switching over rapidly to fluctuations about a difi'erent con- 



stant level Il44l - l46ll . Based on these, and similar earlier observations, economics 



and finance practitioners explored various methods for decomposing a nonsta- 
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tionary time series into stationary segments, which are called regimes or trends 
in the economics and finance literatures. In these literatures, segment boundaries 
are referred to as structural breaks, trend breaks, or change points. The earliest 
works in this field are by Goldfeld and Quandt [|47n . and by Hamilton [|48n . Since 
these pioneering works, an enormous economics literature on structural breaks 
and change points has been amassed, a few based on the original Markov switch- 
ing modelsJ49|], and many others based on autoregressive models and unit-root 
tests ll50l-l64ll . In the econophysics literature, apart from our own work, we are 
aware only of the work by Vaglica et ah, who broke the transaction histories of 
three highly liquid stocks on the Spanish stock market into directional segments 
to study trading strategies adopted in this market feSH . and the recent preprint by 
Toth et al. , who segmented the time series of market orders on the London Stock 
Exchange, modeling each segment by a stationary Poisson process ll66n . 

As with all model-driven segmentation of time series data, we assume that 
each economic sector time series x, consist of M, segments, and that within seg- 
ment m,, the log-index movements x"'/ follows a stationary statistical distribu- 
tion. From the seminal work by Mantegna and Stanley [|67D, we know that high- 
frequency index movements can be fitted very well to stable Levy distributions. 
We also know from the study by KuUmann et al. mm that the daily log-index 
movements can be fitted well to a truncated Levy distribution, when the sample 
size is small, but becomes normally distributed when the sample size is large. This 
suggests that the appropriate model for each stationary segment ought to be a Levy 
stable process. However, parameter estimation for Levy stable distributions 1169 - 



7611 is a computationally expensive process, and computing the probability density 



1 77l-l8 Ih is equally tedious. From our experience segmenting biological sequences, 
we know that segment boundaries that are statistically very significant can be dis- 
covered by any segmentation procedure, no matter what model we assumed for 
the underlying stationary segments. We believe that the most statistically signif- 
icant segment boundaries in financial time series would also be equally insensi- 
tive to choice of model, or model mis-specification. Indeed, when we compared 
segments of the 2002-2003 DJIA half-hourly time series obtained assuming that 
the log-index movements are normally distributed, against those obtained assum- 
ing the log-index movements are Levy stable distributed, the strongest segment 
boundaries are in good agreement (no more than two days apart) ll82n . With this 
reassurance, we chose to intentionally mis-specify the model, and work instead 
with the lognormal index movement model. In this model, the log-index move- 
ments in segment m; are assumed to follow a stationary Gaussian process with 
mean yu, ,„. and variance cr^„, . Unlike parameter estimation for the Levy distribu- 
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tion, maximum-likelihood estimates of the Gaussian parameters yu, and cr^^, can 
be done very cheaply. 

To find the unknown segment boundaries which separates segments m, 
and m, + 1, we use the recursive segmentation scheme introduced by Bemaola- 
Galvan et al. ll24l l25n. In this segmentation scheme, we start with the time series 
x = {xi, . . . ,Xt, Xt+i, . . ., Xn), and compute the Jensen-Shannon divergence 118311 



A(0 = In 



Liit) 



(1) 



where within the log-normal index movement model, 
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is the likelihood that x is generated probabilistically by a single Gaussian model 
with mean // and variance cr^, and 
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(3) 



is the likelihood that x is generated by two statistically distinct models: the left 
segment x^ = {xi, . . . , jc,) by a Gaussian model with mean /i^ and variance (t\, 
and the right segment x^ = (jCf+i, . . . , x„) by a Gaussian model with mean 
and variance cr\. In terms of the maximum likelihood estimates ju,fiL,fiR and 
(T^, the Jensen-Shannon divergence A(f), which measures how much better 
a two- segment model fits the time series data compared to a one-segment model, 
simplifies to 

A(t) = nlnd- - tln&L - (n - t)\nd-R + ^> 0. (4) 

Scanning through all possible times t, a cut is then placed at f, for which the 
Jensen- Shannon divergence 



A* = A(f) = maxA(0 

t 



(5) 



is maximized, to break the time series x = (jci, . . . , x„) into two statistically most 
distinct segments x^ = {xi,...,Xf) and x^ = . . . , x„) (see for example, 

Fig.|3]). This one-into-two segmentation is then applied recursively onto x^ and x^ 
to obtain shorter and shorter segments (see also Fig.|3l for example). At each stage 
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of the recursive segmentation, we also optimize the segment boundaries using the 
first-order algorithm described in Ref. ll84ll . where we recompute the optimum 
position of segment boundary m, within the time series subsequence bound by 
segment boundaries m + 1 . This is done iteratively for all segment boundaries, 
until they have all converged onto their optimum positions. 



2500- 




Figure 3: The Jensen-Shannon divergence spectrum for the DJIA time series from Jan 1997 to 
Aug 2008 (red). This is a typical spectrum consisting of one very strong peak, in this example, 
at mid-2003. Also show are the Jensen-Shannon divergence spectra for the left segment (green, 
1997 to mid-2003) and the right segment (blue, mid-2003 to Aug 2008) obtained at the second 
stage of the recursive segmentation. In this example, the two segments have divergence maxima 
at mid-2002 and mid-2007 respectively. 



As the optimized recursive segmentation progresses, the Jensen-Shannon di- 
vergence of newly discovered segment boundaries, as well as the previously dis- 
covered segment boundaries, will in general become smaller and smaller. Seg- 
ment boundaries thus become less and less significant statistically, and at some 
point, we must terminate the recursive segmentation. There are three ways to do 
so. In the first approach, the Jensen-Shannon divergences of new segment bound- 
aries are tested for statistical significance against various distributions with the 
appropriate degrees of freedom [|24l l25|l . The recursive segmentation terminates 
when no new segment boundaries more significant than the chosen confidence 
level p can be found. In the second approach, new segment boundaries are ac- 
cepted if the information criteria of the two-segment models they imply are larger 
than the information criteria of the one-segment models we are selecting against 
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i85L 18611 ■ Here, the recursive segmentation terminates when further segmentation 
does not explain the data better. In the third approach, we define signal-to-noise 
ratios based on the Jensen-Shannon divergence fluctuations within supersegments 
that new segment boundaries are supposed to divide [,84,1 . The recursive segmen- 
tation terminates when the signal-to-noise ratios of all new segment boundaries 
fall below a chosen threshold value. 

Alternatively, we could also terminate the recursive segmentation when no 
new optimized segment boundaries with Jensen-Shannon divergence greater than 
a cutoff of Ao = 10 are found. The short and medium segments produced by this 
termination criterion are reasonable, but the long segments obtained tend to have 



internal segment structures masked by their context 18711 . We then recursively seg- 
ment these long segments, by progressively lowering the cutoff Aq until a segment 
boundary with strength A > 10 appears. The final segmentation then consists of 
segment boundaries discovered through the automated recursive segmentation, as 
well as segment boundaries discovered through progressive refinement of overly 



long segments. Based on the experience in our previous works 11261 12711. this semi- 



automatic recursive segmentation appears to produce acceptable results. 

2.4. Segment clustering 

After the segmentation is completed, we obtain a large number (typically 
> 100) of segments for each time series. While successive segments are statis- 
tically distinct from each other, segments that are far apart can actually be sta- 
tistically similar. This observation suggests that the large number of segments 
make up a small number of segment classes. By comparing multiple indicators, 
economists classified different market periods into four macroeconomic phases or 
regimes: (i) a growth phase; (ii) a contraction phase; (iii) a correction phase; and 
(iv) a crash phase. We therefore expect the time series segments to also be orga- 
nized into roughly four classes. A similar problem arise in biological sequences, 
where thousands of segments can be organized into tens of segment types that 
differ in their biological functions. In the ground-breaking paper by Azad et ah, 
the 248 segments of human chromosome 22 was classified into 53 domain types 



using single-link hierarchical clustering [[880. Inspired by this prospect of reduc 



ing the complexity of our segmentations, we performed independent hierarchical 
agglomerative clusterings on the segments within each US economic sector time 
series, using the complete link algorithm (see Ref. [i89i1 for details on the com- 
plete link algorithm, and also a review on the broad area of statistical clustering). 
We chose the complete link algorithm, which is favored by social scientists for 



producing compact and internally homogeneous clusters [19011 . because our goal 
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is to discover macroeconomic phases with well-defined statistical properties. We 
do not use the far more popular single link algorithm 119 iL 19211. because it tends to 
produce loose and elongated clusters [93]. Single link clustering is more meaning- 
ful in the biological sciences, generating phylogenetic trees for example, because 
the clustering procedure corresponds more closely with the nature of evolution- 
ary changes. In general, if one expects to find highly homogeneous collections 
of objects one would use complete link clustering, whereas if one expects to find 
collections of objects that evolved from common ancestors one would use single 
link clustering. 




Figure 4: The complete-link hierarchical clustering tree for the time series segments of the Dow 
Jones Industrial Average between January 1997 and August 2008. In this tree, we show the Jensen- 
Shannon divergence values at which the top branches diverge. We also show how uniform thresh- 
olds can be selected to break the tree into two, three, four, or six clusters. Finally, we show how 
individual thresholds can be selected to obtain the six clusters reported in Ref ll26ll . which are 
colored in increasing order of market volatility as deep blue, blue, green, yellow, orange, and red. 

In this segment clustering, we used the Jensen-Shannon divergences between 
segments as their statistical distances. Clustering of different periods within a fi- 
nancial time series has been previously investigated [|94l- l96|l . in the absence of 
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any segmentation analysis. After complete-link hierarchical clustering of the seg- 
ments within a given time series, we typically end up with a dendrogram like 
that shown in Fig. |4] for the Dow Jones Industrial Average. By varying a uni- 
form threshold, different number of clusters can be identified. A small number of 
clusters provide a coarser description, whereas a larger number of clusters offer a 
finer description of the dynamics within the time series. This different numbers of 
clusters form a nested hierarchy of coarse-grained descriptions of the US macroe- 
conomic dynamics. All these descriptions are correct, but some are more useful 
than others, because they are statistically more robust. For example, in Fig. IH 
we can identify four clusters if the uniform threshold is 42.7 < A < 102.2, or 
five clusters if the uniform threshold is 34.4 < A < 42.7, or six clusters if the 
uniform threshold is 31.3 < A < 34.4. Because of the broader range of uniform 
thresholds, a four-cluster description is statistically more robust than a five-cluster 
or six-cluster descriptions of the time series segments. 

Once we understand statistical robustness as the primary criterion for cluster 
selection, we can also employ local thresholds for each cluster. In Fig.|4l we show 
as an example the local thresholds used to pick six clusters. Other local thresholds 
can also be used, but so long as they are statistically robust, one choice of clusters 
offer no advantage over another choice of clusters. These different choices of 
clusters tell the same story, merely with different contrasts, very much like red- 
tinted and blue-tinted versions of the same photograph. With this in mind, we 
analyzed the hierarchical complete-link clustering trees obtained for all ten DJUS 
economic sectors, and selected between four to six clusters of segments for each 
US economic sector. These clusters represent different macroeconomic phases 
(differentiated by their market volatilities) present in the time series data. Once all 
segments have been assigned to their respective clusters, we use the heat-map-like 
color scheme in Table [2] to plot the temporal distributions of clustered segments. 
All the analyses presented in this paper are based on features identified from the 
temporal distributions of clustered segments for the ten DJUS economic sector 
indices. 



2.5. Identifying corresponding segments 

Of the many features that we can identify from individual temporal distribu- 
tions, as well as across the panel of temporal distributions, corresponding seg- 
ments that appear in all or most of the indices are the most striking visually. In the 
economics and finance literature, a mean or vol atilit y movement that occurs ove r 
multiple time ser ies i s call ed comovement [|97| - 1 10311. comm on jumps [Il04l - ll06n . 
common shocks 11 1071 - 1 10911 . or common breaks lIllOl - lllSll . The consensus that 



14 



Table 2: Heat-map-like color scheme for the different volatility clusters, and the macroeconomic 
phases they correspond to. The crisis phase, which consists of the high-volatility (yellow) and 
very-high-volatility (orange) clusters, is significantly longer than the economic contraction phase 
accepted by economists. In fact, economic contraction, as determined by successive quarters of 
contraction in the GDP, typically occurs at the end of a crisis phase. Also shown are the average 
standard deviation in each phase for the various economic sectors. The extremely-low-volatility 
phase is seen only in CY, EN, and FN, because the time series segments are organized into six clus- 
ters for these three sectors only. As we can see, within each macroeconomic phase, the average 
volatilities discovered by the clustering procedure are fairly consistent throughout most sectors. 
The exceptions are HC and TL, which have consistently lower volatilities. It is possible to in- 
troduce a seventh cluster with volatility cr ^ 0.008, and have all the clusters reclassified. This 
will produce a deterministic mapping between volatility and color, but we choose not to, so as to 
achieve maximum visual contrast with the present color scheme. 



volatility 


extremely low 


low 


moderate 


high 


very high 


extremely high 


color 


black 


blue 


green 


yellow 


orange 


red 


phase 


growth 


correction 


crisis 


crash 


BM 




0.0016 


0.0037 


0.0046 


0.0069 


0.0146 


CY 


0.0005 


0.0015 


0.0023 


0.0031 


0.0053 


0.0121 


EN 


0.0010 


0.0014 


0.0027 


0.0037 


0.0058 


0.0152 


FN 


0.0007 


0.0016 


0.0024 


0.0039 


0.0058 


0.0134 


HC 




0.0006 


0.0016 


0.0023 


0.0041 


0.0076 


IN 




0.0013 


0.0022 


0.0035 


0.0056 


0.0140 


NC 




0.0009 


0.0015 


0.0022 


0.0034 


0.0085 


TC 




0.0019 


0.0030 


0.0042 


0.0082 


0.0121 


TL 




0.0008 


0.0018 


0.0024 


0.0033 


0.0078 


UT 




0.0014 


0.0023 


0.0030 


0.0038 


0.0088 



arise from this body of work is that the statistical significance of a change point is 
amplified by the cross section it occurs concurrently over. 

In our study, the corresponding segments do not necessarily start at the same 
time, because our use of high-frequency data allows us to identify the change 
points that are individually optimum for the ten DJUS economic sector indices. 
More importantly, our correspondin g se gments in the various indices do not end at 
the same time. As discussed in Ref. [|27n . the durations of each corresponding seg- 
ment, and the Jensen-Shannon divergence values at the start of these segments, tell 
us how strongly the shock impacted diff"erent sectors in the US economy. More- 
over, the difi'erent start times of the corresponding segments allow us to roughly 
map out the progress of the shock. 

Because of the difi'erent start times and difi'erent durations, we mark segments 
in the ten DJUS economic sector indices as corresponding segments if they (i) 
have similar volatilities (high and high, or low and low); or (ii) are fianked by 
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volatility movements in the same directions(low-to-high and moderate-to-high, or 
high-to-low and moderate-to-low). For this, we took advantage of the heat- map- 
like color scheme in the temporal distributions. 

2.6. Cross-correlations 

In performing segmentation and thereafter segment clustering, we have selec- 
tively discarded information contained in the ten high-frequency time series to 
obtain a coarse-grained picture of the US macroeconomic dynamics. While this 
picture provides a useful bird's eye view of the dynamical processes within the US 
economy, a significant amount of useful information has also been thrown out. To 
recover more of the information contained in the high-frequency time series, and 
shed more light on the exciting stories unfolding before our eyes, we compute the 
normalized cross-correlation matrix C, whereby the matrix element 

„ Yjt=Yi^it ~ jt ~ ^ j) 

Cij = I ^ _ = (6) 



is the zero-lag cross-correlation between US economic sectors / and j. 

Cross-correlations between different stocks, and between different benchmark 
indices have been widely studied in the finance literature. Such studies have been 
particularly popular in the bid to understa nd the me ltdown of global financial 



markets during the present financial crisis [|ll7l - ll22|] . In the econophysics lit 



erature, there have been attempts to understand the nontrivial cros s-correlat ions 



between different financial time series using random matrix theory [11231 - 1 13011 . In 
all these studies, the cross-correlations were computed either over the entire data 
period, or over sliding windows. In our own study, we not only calculate the 
cross-correlation matrix over the entire duration of the time series, but also over 
two-year intervals strictly within the growth and crisis macroeconomic phases, 
and over individual corresponding segments. 

To compute the cross -correlation matrix over a given corresponding segment, 
we select the largest interval within which most sectors can be found in a sin- 
gle macroeconomic phase, as shown in Fig. [5l For the primarily high- volatility 
corresponding segment identified in Fig. [5l the start dates and end dates in the 
different sectors are shown in Tabled Based on these dates, we chose our interval 
to start on July 25, 2007, and end on August 14, 2007, so that all DJUS economic 
sectors, with the exception of IN, TC, and UT, are strictly in the high-volatility 
phase. Within this interval, UT is strictly in the extremely-high-volatility phase. 
The selected high-volatility interval overlaps with the moderate-volatility phases 
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Figure 5: Interval selected for the computation of the cross-correlation matrix between the ten 
DJUS economic sector indices. In this figure, the lower and upper limits of the interval are chosen 
such that the interval covers a single macroeconomic phase for nearly all indices. Exception is 
made for IN, because its high-volatility segment is too short, and thus the selected interval also 
covers the preceding moderate-volatility segment. 

in IN and TC. This cannot be helped, because the high-volatility phases in IN and 
TC started so late. Furthermore, we believe it is meaningful to allow this overlap, 
because the start dates of the moderate- volatility segments in IN and TC are very 
close to the start dates of the high- volatility segments in the other sectors. 

In Table m we show the cross -correlation matrix computed using (a) this opti- 
mum interval, such that apart from IN and TC, all sectors are in the high- volatility 
or extremely-high- volatility phases. To show that the criterion we used to se- 
lect interval (a) produce statistically robust cross correlations, we compare them 
against cross correlations computed using the intervals (b) and (c). Interval (b) 
is four trading days shorter than interval (a), with two trading days taken off the 
latter's start and end. As with interval (a), all sectors apart from IN and TC are 
in the high-volatility or extremely-high-volatility phases within this interval. In 
contrast, interval (c) is four trading days longer than interval (a), with two trading 
days added to the latter's start and end. Unlike for intervals (a) and (b), cross cor- 
relations computed within this longer interval will contain contributions from the 
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Table 3: Start dates and end dates in different DJUS economic sectors for the primarily high- 
volatility corresponding segment identified in Fig.|5] The high- volatility segments for IN and TC 
started significantly later than those in other sectors, so we also show the start and end dates for 
the preceding moderate-volatility segments for these two sectors. For UT, it is a extremely-high- 
volatility segment that coincides with the high- volatility segment in other sectors. 



Economic Sector 


Start Date 


End Date 


BM 


July 23, 2007 


August 14, 2007 


CY 


July 25, 2007 


August 16, 2007 


EN 


July 23, 2007 


August 15, 2007 


FN 


July 23, 2007 


August 15, 2007 


HC 


July 20, 2007 


August 15, 2007 


IN 


1 July 19, 2007 


August 7, 2007 1 




August 8, 2007 


August 16, 2007 


NC 


July 25, 2007 


September 9, 2007 


TC 


B July 17, 2007 


August 14, 2007 1 




August 15, 2007 


August 16, 2007 


TL 


July 20, 2007 


August 16, 2007 


UT 





lower- volatility phases adjacent to the high-volatility corresponding segment for 
nearly all sectors. As we can see from Table |4l the maximum positive difference 
between cross correlations in (b) and cross correlations in (a) is -1-0.063, occurring 
for CY-TL, and the maximum negative difference between cross correlations in 
(b) and cross correlations in (a) is -0.029, occurring for HC-EN. Similarly, the 
maximum positive difference and maximum negative difference between cross 
correlations in (c) and cross correlations in (a) is -1-0.041 and -0.072, occurring 
for HC-EN and BM-FN, respectively. The differences in cross correlations is gen- 
erally larger for the longer interval (c) than for the shorter interval (b), because 
in (b), the interval contains a single statistically stationary segment for most sec- 
tors, whereas in (c), the interval incorporated time series data from more than one 
segments. Nonetheless, these cross correlational differences are all small. 

To understand what impacts these small cross correlational differences have 
on the minimal spanning trees (described in the next subsection) we generate for 
this study, we show in Fig. [6] the minimal spanning trees derived from the cross- 
correlation matrices for these three intervals. In this figure, we see the same EN- 
BM-IN-CY-NC-HC backbone, which is insensitive to how we select the interval, 
and thus statistically robust. We also see that the sectors TC, TL, and UT, are 
linked sometimes to IN, and other times to CY. Going back to the cross correla- 
tions shown in Table |4] between TC, TL, UT and CY, IN, we find poor agreements 
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Table 4: Cross-correlation matrix computed from the half-hourly times series of the ten DJUS 
economic sector indices over three intervals: (a) from July 25, 2007 to August 14, 2007; (b) from 
July 27, 2007 to August 10, 2007; and (c) from July 23, 2007 to August 16, 2007. 



(a) 





BM 


CY 


EN 


FN 


HC 


IN 


NC 


TC 


TL 


UT 


BM 




0.872 


0.825 


0.898 


0.726 


0.900 


0.789 


0.818 


0.709 


0.832 


CY 


0.872 




0.753 


0.898 


0.826 


0.915 


0.876 


0.856 


0.768 


0.835 


EN 


0.825 


0.753 




0.750 


0.663 


0.776 


0.720 


0.745 


0.607 


0.759 


FN 


0.898 


0.898 


0.750 




0.771 


0.889 


0.845 


0.827 


0.741 


0.814 


HC 


0.726 


0.826 


0.663 


0.771 




0.827 


0.913 


0.804 


0.772 


0.770 


IN 


0.900 


0.915 


0.776 


0.889 


0.827 




0.861 


0.877 


0.808 


0.842 


NC 


0.788 


0.876 


0.720 


0.845 


0.913 


0.861 




0.852 


0.783 


0.819 


TC 


0.818 


0.856 


0.745 


0.827 


0.804 


0.877 


0.852 




0.769 


0.783 


TL 


0.709 


0.768 


0.607 


0.741 


0.772 


0.808 


0.783 


0.769 




0.729 


UT 


0.832 


0.835 


0.759 


0.814 


0.770 


0.842 


0.819 


0.783 


0.729 




(b) 




BM 


CY 


EN 


FN 


HC 


IN 


NC 


TC 


TL 


UT 


BM 




0.888 


0.807 


0.891 


0.713 


0.909 


0.783 


0.843 


0.746 


0.823 


CY 


0.888 




0.752 


0.928 


0.848 


0.932 


0.910 


0.904 


0.831 


0.859 


EN 


0.807 


0.752 




0.728 


0.634 


0.771 


0.703 


0.747 


0.619 


0.736 


FN 


0.891 


0.928 


0.728 




0.770 


0.894 


0.849 


0.849 


0.770 


0.808 


HC 


0.713 


0.848 


0.634 


0.770 




0.826 


0.921 


0.793 


0.784 


0.772 


IN 


0.909 


0.932 


0.771 


0.894 


0.826 




0.880 


0.880 


0.830 


0.847 


NC 


0.783 


0.910 


0.703 


0.849 


0.921 


0.880 




0.865 


0.816 


0.838 


TC 


0.843 


0.904 


0.747 


0.849 


0.793 


0.880 


0.865 




0.764 


0.802 


TL 


0.746 


0.831 


0.619 


0.770 


0.784 


0.830 


0.816 


0.764 




0.768 


UT 


0.823 


0.859 


0.736 


0.808 


0.772 


0.847 


0.838 


0.802 


0.768 




(c) 




BM 


CY 


EN 


FN 


HC 


IN 


NC 


TC 


TL 


UT 


BM 




0.837 


0.835 


0.826 


0.738 


0.912 


0.770 


0.840 


0.705 


0.821 


CY 


0.837 




0.771 


0.891 


0.847 


0.906 


0.891 


0.868 


0.787 


0.844 


EN 


0.835 


0.771 




0.743 


0.704 


0.802 


0.736 


0.775 


0.639 


0.783 


FN 


0.826 


0.891 


0.743 




0.785 


0.860 


0.853 


0.824 


0.737 


0.804 


HC 


0.738 


0.847 


0.704 


0.785 




0.842 


0.914 


0.829 


0.784 


0.793 


IN 


0.912 


0.906 


0.802 


0.860 


0.842 




0.860 


0.902 


0.808 


0.838 


NC 


0.770 


0.891 


0.736 


0.853 


0.914 


0.860 




0.868 


0.792 


0.826 


TC 


0.840 


0.868 


0.775 


0.824 


0.829 


0.902 


0.868 




0.785 


0.796 


TL 


0.705 


0.787 


0.639 


0.737 


0.784 


0.808 


0.792 


0.785 




0.746 


UT 


0.821 


0.844 


0.783 


0.804 


0.793 


0.838 


0.826 


0.796 


0.746 
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over the three intervals for the CY-TC, CY-TL and CY-UT cross correlations. In 
contrast, the IN-TC, IN-TL, and IN-UT cross correlations are in good agreement 
over the three intervals. This suggests that the minimal spanning links between 
TC, TL, UT and CY, IN are not as statistically robust as the rest of the minimal 
spanning links in this corresponding segment. In any case, we shall see in Section 
[3]that CY and IN are core sectors, while TC, TL, and UT are fringe domestic sec- 
tors of the US economy. Whether they are to IN or CY, minimal spanning links 
directly connect TC, TL, UT from the fringe to CY, IN in the core. This direct 
core-fringe linkage in itself represents a statistically robust characteristic of the 
US economy. 



EN 



BM 



FN 



UT 



TC 



CY 



NC 



HC 



(a) 



EN 




BM 









TC 



UT FN 



(b) 



TL 



EN 




BM 









TC 



UT FN 



(C) 



Figure 6: The minimal spanning trees of the ten DJUS economic sectors, constructed using half- 
hourly time series from (a) July 25, 2007 to August 14, 2007; (b) July 27, 2007 to August 10, 
2007; and (c) July 23, 2007 to August 16, 2007. 
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2. 7. Minimal spanning trees 

Even though our cross-correlation matrices are only 10 x 10 in size, the in- 
formation contained in the 36 independent matrix elements is still not easy for a 
human to process. To better understand the correlational dynamics between the 
US economic sectors at different times, we look instead at simplified graphical 
representations of the cross -correlation matrices. For this study, we work primar- 
ily with the minimal spanning tree (MST) representation of the cross-correlation 
matrix. In Section |431 We also also explore the planar maximally filtered graph 
(PMFG) representation, to understand what kind of cross-correlational structures 
have been left out in the MST representation, in which cycles are not admitted. 
Ultimately, if we had analyzed the cross correlations net work of al l stocks on the 
American stock markets, community detection methods [Il3ll - ll37n will allow us 
to develop a coarse grain description of the US economy, and thereby shed more 
light on its dynamics. In this study, the use of such techniques are not neces- 
sary, since we are only looking at ten DJUS economic sector indices, which are 
themselves already coarse grain descriptions of the US markets. 

The minimal spanning tree (also called minimum spanning tre e) app roach to 
unde rstanding weighted graphs is frequently credited to Kruskal [|l38n or Prim 
[113911, although there were studies dating all the way back to 1926. For a good 
reading on the hist ory o f the minimal spanning tree method, see the artic le by 
Grah am and Hell 11 14011 . In economics, the method is not widely used lll41 . 
142ll . However, since its first application in econophysics by M antegna 11 14311 . 



and shown to be a robust caricature of the underlying correlations lll44i 114511 . the 
MST has been inco rporated in to the basic tool suite for statistical analysis of fi- 
nancial market data lll46l - ll52ll . In particular, Onnela et al. made extensi ve us e 
ofMSTs to study the dynamics of cross correlations during market crashes 11153 - 
15511 . Clustering techniques based on the MS T have also been used to discover 



European eco nomies ar e evolving 1|16 




different sectors in a stock market ill 56 416111 . how the interdependences of the 



163n . and how global markets are linked 



to each other ill 641 - 1 16611 . More recently, Eom et al. used the MST as a means to 
reduce the N{N - l)/2 linkages between stocks to A'^ - 1 links, for s tudying the 
effects of market factors on the information flow between stocks lll67ll . 

To construct an MST re prese ntation of the cross-correlation matrix, Mantegna 
defined the distance metric lll43ll 



dij = V2(l - Qj\ 



(7) 



which measures the statistical distance between two financial time series / and /, 
whose cross-correlation is -1 < Cij < 1. Applying Kruskal's algorithm lll38ll . 
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a link is first drawn connecting the pair (iuji) of time series with the smallest 
distance dij^ = min(, djj. Following this, a link is drawn connecting the pair 
(h, ji) with the next smallest distance dhj^ = min(, This process is 

repeated with pairs {ik,jk) with increasingly larger distances ^4^^, until all time 
series are incorporated into the spanning graph. There is one additional constraint: 
if (z/, ji) is the next pair of time series to be linked based on their distance 
but will create a cycle in the growing graph in so doing, no link will be drawn 
between // and j/. Instead, we will skip and move on to the pair (im,jm) 

with the next smallest distance ^i?;„j„,. The spanning graph obtained at the end 
contains no cycles, hence the name minimal spanning tree. Alternatively, since 
dij is a monotonically decreasing function of C,y, we can get the same MST, if we 
start by linking the pair of time series with the largest cross-correlation, and then 
progressively linking pairs with smaller and smaller cross-correlations, so long as 
we ensure the no-cycle constraint is satisfied at all times. 



3. Macroeconomic MSTs 

Before we move on to our MST analysis, let us first develop an intuitive pic- 
ture for the sectorial structure of the US economy. As a significant fraction of 
what the US produces is consumed domestically, the US market is a gigantic con- 
sumption market. We therefore expect the noncyclical consumer goods (NC) and 
consumer services (CY) to be central players in the US economy. Furthermore, 
CY and NC consume products predominantly generated by the industrials (IN), 
thus we expect IN, CY and NC (and perhaps also FN, since financing is an impor- 
tant ingredient in US consumerism) to be the core sectors of the US economy. In 
contrast, emerging economic sectors such as telecommunications (TL) and tech- 
nologies (TC), along with less attractive economic sectors like healthcare (HC) 
and utilities (UT), contribute less significantly to the GDP, and hence sit at the 
fringe of the US economy. Finally, the oil & gas (EN) and basic materials (BM) 
sectors are strongly driven by changes in the global supply and demand cycle, and 
thus represent the US economy's connection to the global market. 

Indeed, this intuitive picture is suppor ted b y quantitative GDP data from the 



US Bureau of Economic Analysis (BEA) III68II . The BEA uses an industry clas- 
sification scheme diff"erent from the ICB, so we map the BEA data onto the ICB 
economic sectors using the subindustry descriptions available, as shown in Table 
|5l to figure out what their GDP contributions are. Though somewhat overesti- 
mated because of this mapping problem, we see that CY, IN, and NC combined 
contributed on average about 7,000 billion USD to the US gross domestic product 
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(GDP), which was worth 14,200 billion USD on average between 2007 and 2009. 
These three sectors therefore contribute about 50% of the US GDP. FN, by itself, 
contributed about 3,000 billion USD to the US GDP, or about 21%. These contri- 
butions support our intuitive picture of these four sectors playing a central role in 
the US economy. Based on the subindustry breakdown in Table [5l HC probably 
contributes on average slightly more than 1,000 billion USD to the US economy 
annually, or about 7% of the GDP. The contributions from TC and TL are hard 
to nail down, because their subindustries are frequently lump with others that are 
assigned to CY, but we estimate that their contributions to the GDP are around 
the 300 billion USD mark, or about 3% of the GDP From Table [51 we see that 
UT contributes 260 billion USD (about 2%) on average annually, while BM and 
EN both contribute around the 400 billion USD mark, also around 3%, to the US 
GDP. Again, based on these numbers, we are probably not too far off guessing 
that these sectors are less critical to the US economy. 

To further verify our intuitive picture of the US economy, we computed the 
cross-correlation matrix of the time series from February 2000 to August 2008. As 
shown in Table [6l IN and CY are the most strongly correlated, with C(JN, CY) = 
0.815, while EN and TC are the least strongly correlated, with C(EN, TC) = 
0.254. Based on the average cross correlations (C), it also appears that IN is 
most strongly tied in with the rest of the sectors ({C)(IN) = 0.617), while EN is 
least strongly tied in with the rest of the US economy ((C)(EN) = 0.349). Based 
on this cross-correlation matrix, we constructed the MST shown in Fig. Ha). As 
expected, the core sectors of the US economy, IN, CY and NC, are at the centre of 
the MST. The sectors EN and BM, which represent the US economy's connection 
to the world market, sit on one end of the MST, while the sectors HC, TC, TL, 
and UT, lies on the fringe of the MST, consistent with their lesser importance to 
the US economy. Heimo et al. arrived at a sim ilar conclusion, in their MST study 
of 1 16 NYSE stocks from 1982 to 2000 lfl69ll . 

Over the period 2000 to 2009, the US Natio nal B ureau of Economic Research 
(NBER) recorded two economic contractions lll70ll . The first was from March 
2001 to November 2001. The second was from December 2007 to June 2009. 
Our own studies showed that the US economy went from a crisis phase (mid- 1998 
to mid- 2003, which contained the March 2001 to November 2001 contraction) 
into a growth phase (mid-2003 to mid-2007), and back into a crisis phase (mid- 
2007 to present, which contained the December 2007 to June 2009 contraction) 
ll26ll . We expect interesting structural differences between the MSTs constructed 
entirely within the previous crisis (2001-2002, Fig. Hb)), the previous growth 
(2004-2005, Fig. lite)), and the present crisis (2008-2009, Fig.j^d)). Indeed, we 
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Table 5: Contributions in billions of USD to the US gross domestic product (GDP) by the various 
Bureau of Economic Analysis (BEA) industries for 2007, 2008, and 2009. The industry classifica- 
tion used by the BEA differs from the ICB. While some BEA industries can be easily assigned to 
a unique DJUS economic sector, other BEA industries contain subindustries from more than one 
DJUS economic sector. A BEA industry is marked with a dagger, if we assign it to one economic 
sector, but some of its listed subindustries to other economic sectors. In these cases, the GDP 
contributions will be split between the economic sectors. On the other hand, if its subindustries 
belong to different economic sectors, but are not finely divided enough for us to split its GDP 
contributions, the BEA industry will be marked with an asterix. For BEA industries with mixed 
components in CY, I N, an d NC, we do not break the GDP contributions down by subindustries. 
Data taken from Ref. II168 I. 



Industry 


2007 


2008 


2009 


Economic 
Sector 


Gross domestic product 


14,061.8 


14,369.1 


14,119.0 




Private industries 


12,301.9 


12,514.0 


12,196.5 




Agriculture, forestry, fishing, and 


144.7 


160.1 


133.1 


NC 


hunting* 










Mining t 


91.3 


106.3 


99.1 


BM 


Oil and gas extraction 


162.9 


210.8 


141.7 


EN 


Utilities 


248.8 


262.6 


268.1 


UT 


Construction 


657.2 


623.4 


537.5 


IN/NC 


Manufacturingt 


1,138.9 


1,119.9 


1,092.1 


IN/NC 


Primary metals 


59.0 


61.5 


43.4 


BM 


Paper products 


58.6 


53.8 


56.1 


BM 


Petroleum and coal products 


149.7 


151.9 


120.0 


EN 


Chemical products 


223.2 


201.1 


216.5 


BM/HC 


Plastics and rubber products 


69.5 


59.4 


56.7 


BM 


Wholesale trade 


813.3 


822.9 


780.8 


CY 


Retail trade 


886.1 


840.2 


819.6 


CY 


Transportation and warehousing 


405.4 


418.7 


389.5 


IN/CY 


Information*! 


285.6 


293.4 


283.5 


CY 


Broadcasting and 


347.7 


359.1 


355.8 


TL 


telecommunications 
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Figure 7: The MSTs of the ten DJUS economic sectors, constructed using half-hourly time series 
from (a) February 2000 to August 2008, (b) 2001-2002, (c) 2004-2005, and (d) 2008-2009. The 
first and the third two-year windows, (b) and (d), are entirely within an economic crisis, whereas 
the second two-year window, (c), is entirely within an economic growth period. 
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Table 6: Cross-correlation matrix computed from the half-hourly time series of the ten DJUS 
economic sector indices over the period February 2000 to August 2008. Also shown are the cross 
correlations (C) of each economic sector averaged across the rest of the US economy. 
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0.261 


0.403 




0.436 


0.469 


0.325 


0.342 


0.323 


IN 


0.656 


0.815 
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0.775 
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TL 


0.435 
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0.382 


UT 


0.458 


0.433 


0.451 


0.440 


0.323 


0.445 


0.485 
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0.382 




<C) 


0.510 


0.591 


0.349 


0.555 


0.371 


0.617 


0.529 


0.484 


0.475 


0.410 



see two topologically distinct MST structures: a chain-like MST structure which 
occurs for both crises, and a star-like MST structure which occurs for the growth 
phase. Even though we only have three data points (two crises and a growth), 
we believe the generic association of chain-like MST and star-like MST to the 
crisis and growth phases respectively is statistically robust. Our reasons are two- 
fold. First, the MST is a representation based on order statistics (ranks of cross 
correlations). Results derived based on order statistics, which are insensitive to 
noise, tend to be highly robust statistically, as we have illustrated in Section 12.61 
Second, the star-to-chain transition in the MST structure as the US economy goes 
from growth into crisis cannot be brought about by a fixed quantum increase, nor 
can it be caused by a proportional increase, in correlations. These two types of 
correlational changes do not change the ordering of cross correlations among the 
ten economic sectors, and hence cannot modify the MST. 

Since noise and global shifts in correlations cannot be responsible for the star- 
to-chain or the chain-to-star transitions, correlational changes that accompany 
these transitions must be highly significant. Our assessment that the topology 
change in the MST is statistically significant is further supported by the observa- 
tions by Onnela et ah, who looked at the mean occupation level around the most 
connected node in their MST, and found the mean occupation level becoming low 
during market crashes [Il53l - ll55n . This is the same phenomenon we see for the 
star-to-chain evolution, at the microscopic scale of individual stocks. In the next 
two sections, we will investigate the characters of these correlational changes. 
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and discuss the implications for early detection of true economic recovery based 
on the chain-to-star transition. 

From Fig.|71 we also see that in both the crisis and growth phases, IN is found 
be the central industry of the US economy. This is understandable, since the 
United States is a highly industrialised country with IN driving the rest of the 
sectors. However, when the US economy went from the mid-2003 to mid-2007 
economic growth into the current crisis, the IN star center shed the sectors NC, 
HC and TL, which shifted to other parts of the MST. In the restructured MST, NC 
formed the center of another cluster. We believe this is a signature of the trigger 
role played by NC in the Subprime Crisis, since homebuilders and developers 
(who are most directly affected by the waves of mortgage defaults) are classified 
under this economic sector. More interestingly, the cluster centered around NC 
consists of HC, TL, and UT, which were part of the five sectors that went first 
into the crisis phase (see Fig. 5 in Ref. |27(]). The last of these five sectors is 
IN, so it appears that correlational changes within these five sectors in July 2007 
is responsible for the main difference between the growth MST (Fig. ITJ^c)) and 
the crisis MST (Fig. Ul^))- This gross restructuring of the MST thus provides an 
interesting way to visualize how the current financial crisis propagated throughout 
the entire US economy. 

4. Segment-by-segment analysis 

Even within the macroeconomic growth and crisis phases, the DJUS economic 
sector time series are highly nonstationary. Both the cross correlations between 
the ten sectors, and the MSTs they imply, are expected to be highly dynamic. 
To understand how cross correlations change with time, we extracted the average 
cross correlations of the ten DJUS economic sectors in 11 corresponding seg- 
ments within the present financial crisis (see Fig. [8]). All four macroeconomic 
phases are represented in these 11 corresponding segments. Ranking the average 
cross correlations from highest to lowest in Table|71 we see that IN is always most 
strongly correlated to the rest of the US economy, whatever the prevailing eco- 
nomic climate, followed by CY and NC. Meanwhile, EN is most weakly coupled 
to the rest of the US economy, in most of the corresponding segments. This is 
consistent with our expectation that the oil & gas industry's strong dependence on 
global supply and demand makes it less susceptible to movements within the US 
economy. 

In general, we observe a positive correlation between the average market cross 
correlation ((C)) and the market volatility. As can be seen from Table Ul higher 
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Figure 8: Eleven corresponding segments identified in the time series of the ten DJUS economic 
sector indices after the mid-2007 onset of the present global financial crisis. In this figure, the 
corresponding segments are numbered from ' T to '11'. In the rest of the paper, we label these 
corresponding segments by their dominant volatility, and then by their order of appearance, so that 
'1' = Yl, '2' = Gl, '3' = B, '4' = Y2, '5' = Rl, '6' = Y3, '7' = R2, '8' = Y4, '9' = R3, '10' = R4, 
'11'= G2. Note that Yl is the same corresponding segment identified in Fig.|5] where we show 
the temporal distributions of clustered segments of the ten DJUS economic sectors from May 23, 
2007 to August 29, 2008, instead of from May 23, 2007 to November 25, 2009 in this figure. 

average market cross correlations are generally associated with higher volatility 
phases. Specifically, in the low-volatility economic growth phase (B), the average 
market cross correlation is low, in the range 0.5 < ((C)) < 0.6, whereas in the 
moderate- volatility market correction phase (Gl, G2), the average market cross 
correlation is also moderate, in the range 0.6 < ((C)) < 0.7. In the higher- volatility 
phases (Yl, Y2, Y3, Y4; Rl, R2, R3, R4), the average market cross correlation 
is high, in the range 0.7 < ((C)) < 0.9. The higher correlations observed during 
the higher-volatility phases is consistent with the tendency for traders to panic 
and to buy and sell stocks from different sectors at the same time. Conversely, 
when the market is calm, stocks from different sectors tend to be bought and 
sold at different times, explaining the lower correlations observed for the lower- 
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Table 7: Ranks of the ten DJUS economic sectors based on their average half-hourly cross- 
correlations, over February 2000 to November 2009, as well as over the 1 1 corresponding seg- 
ments identified in Fig. [8] The average cross correlations for EN in Y3, and those for BM, EN, 
and UT in Y4, are anomalously low, even negative. Also shown are the cross correlations ((C)) 
averaged over all ten sectors, for the entire period from February 2000 to November 2009, as well 
as the 1 1 corresponding segments. 





BM 


CY 


EN 


FN 


HC 


IN 


NC 


TC 


TL 


UT 


((C)) 


Entire 


5 


2 


10 


3 


9 




4 


6 


7 


8 


0.489 


Yl 


6 


2 


10 


4 


8 




3 


5 


9 


7 


0.811 


Gl 


6 


2 


10 


5 


7 




3 


4 


9 


8 


0.738 


B 


2 


8 


10 


7 


4 




3 


5 


9 


6 


0.511 


Y2 


4 


3 


10 


8 


6 




2 


5 


7 


9 


0.700 


Rl 


7 


9 


3 


10 


8 




2 


4 


5 


6 


0.797 


Y3 


9 


2 


10 


7 


4 




5 


3 


6 


8 


0.562 


R2 


7 


3 


10 


2 


8 




6 


9 


4 


5 


0.703 


Y4 


9 


4 


10 


2 


5 




7 


3 


6 


8 


0.559 


R3 


5 


2 


10 


10 


4 




3 


6 


9 


8 


0.863 


R4 


4 


3 


5 


7 


6 


2 


1 


8 


10 


9 


0.796 


G2 


4 


2 


6 


7 


9 


1 
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volatility phases. These average market cross correlations are all higher than the 
average market cross correlations computed over the entire time series, because 
cross correlations within the US economy has been increasing over the years (see 



for example, Ref. 1117 ill ) 



4.1. MST structures 

As expected, changes in the MST can be seen going from one corresponding 
segment to the next (see Fig.©. However, the sectors IN, CY and NC remain at 
the cores of all 1 1 MSTs, whereas the sectors HC, TC, TL, and UT are mostly 
found at the fringes of these MSTs. Interestingly, the financials (FN), which is 
frequently found close to the core, occasionally drifts out to the fringe. While 
the core-and-fringe structure of the MSTs remains well defined as the market 
volatility changes, we observe shifting relative importances between the differ- 
ent sectors. We wU study these MST rearrangements, which we believe are the 
US economy's response to shocks originating within specific economic sectors, in 
Section |5l Here, let us make the remarkable observation that, through the fluxes 
of correlational changes, the EN-BM-IN-CY-NC-TC-HC backbone of the MSTs 
remained relatively unchanged throughout the entire crisis period. This robust 
correlational structure must therefore be a key to understanding the performance 
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of the present US economy. 

In Fig. [91 we incorporate more visual information on the cross-correlation 
matrix, by varying the widths of the bonds in the MSTs. The thicker the bond 
between two sectors, the stronger their correlations. As we can then see, sectors 
at the core are generally more strongly correlated than those on the fringes of the 
MSTs. This reinforces our intuitive picture that sectors on the fringe are more 
detached from the overall economy, whereas those at the core are most important 
to the US economy. In this representation of the MSTs, a correlational core con- 
sisting of thick bonds can also be seen. Even as the core and backbone of the 
MSTs remain more or less unchanged, the correlational core of thick bonds ex- 
pands and contracts with time. We can think of the correlational core defining the 
active participants in the US economy for a given corresponding segment. In the 
high-volatility phase, the correlational core expands all the way out to the fringes, 
telling us that fringe sectors become more involved in the US economy during a 
financial crisis. A similar phenomenon was observed by Onnela et al. in the MSTs 



of individual stocks across market crashes [Il53l - ll55h 
4.2. MST dynamics 



In Ref. 112711 . we developed a causal tree analogy speculating that exogenous 
shocks shaking the root of the tree will first be felt strongly by branches closest 
to the root, and then weakly by branches further from the root. Naturally, now 
that we have a better picture of the correlational structure of the US economy in 
the form of an MST, we expect volatility shocks to propagate along the invariant 
backbone of the MST, since it is along this backbone that we have the strongest 
cross correlations. To explore this idea, we make use of high-resolution temporal 
information available from the segmentation/clustering analysis, to identify for 
each corresponding segment the statistically significant start dates in the ten DJUS 
economic sectors. We then rank the start dates from earliest to latest, and in Fig.H] 
label the sectors according to these ranks, omitting those sectors for which the start 
date cannot be identified. From the 1 1 corresponding segments identified within 
the present financial crisis, we find that shocks always originate from the fringe 
of the MST, and propagate inwards. However, contrary to our naive expectations, 
shocks do not necessarily propagate along the MST. For example, in Fig. Oh), we 
see that the corresponding segment Y4 started in TL, propagated to EN (which 
is not directly connected to TL in the MST), and then onto TC and FN (both of 
which are not directly connected to TL or EN), before propagating into the core 
of the MST. This inward propagation of volatility shocks is seen even when the 
MST is anomalous. For example, in Fig. |9te), where TC is at the center of the 
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Figure 9: MSTs of the ten DJUS economic sectors for the corresponding segments (a) Yl, (b) Gl, 
(c) B, (d) Y2, (e) Rl, (f) Y3, (g) R2, (h) Y4, (i) R3, (j) R4, (k) G2, within the present financial 
crisis. In this figure, thicker bonds represent stronger cross correlations, whereas the number 
besides each sector indicates the order with which the sector made the transition into the given 
corresponding segment (whereever they can be identified). 
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Figure 9: (continued) MSTs of the ten DJUS economic sectors for the corresponding segments 
(a) Yl, (b) Gl, (c) B, (d) Y2, (e) Rl, (f) Y3, (g) R2, (h) Y4, (i) R3, (j) R4, (k) G2, within the 
present financial crisis. In this figure, thicker bonds represent stronger cross correlations, whereas 
the number besides each sector indicates the order with which the sector made the transition into 
the given corresponding segment (whereever they can be identified). 
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MST, the corresponding segment Rl started first in FN, which has moved to the 
fringe of the MST, then in TL, then in EN, and BM, before propagating into the 
core of the MST. In no case was a shock found to start at the core of the MST. 



Table 8: Ranks of identifiable start dates in the ten DJUS economic sectors, from earliest to latest, 
for each of the 1 1 corresponding segments between May 2007 and November 2009. 
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Looking at the leading sectors more closely, we find a mix between shocks 
starting in EN and BM, and shocks starting in the fringe domestic sectors. In 
Table [81 we rank the start dates in the ten sectors from earliest to latest, for each 
of the 1 1 corresponding segments. In cases where we have joint leaders, for ex- 
ample, EN and FN in Gl, we split the count between them. In this way, we find 
that out of the 1 1 volatility shocks, only two and a half originated from EN and 
BM. The remaining eight and a half shocks originated in fringe domestic sectors 
which are elTectively not coupled to the global market. This suggests that the US 
economy experiences internal feedbacks that are stronger than its coupling to the 
global economy. More interestingly, we find in Fig.|9]anomalously high cross cor- 
relations at the fringe for some corresponding segments, for example, the HC-NC 
link in Yl, the TC-IN link in B, and the TL-CY link in Y4. As we can see from 
Table [H Yl started in HC, B started in TC, and Y4 started in TL. This suggests 
that fringe cross correlations frequently become anomalously high in the leading 
sector of a volatility shock. This is opposite to what we saw for the previous crisis, 
where there is a pronounced 'distancing-the-leader' effect, i.e. the cross correla- 
tions between the leader sector and all ot her se ctors are smaller than the typical 
cross correlations within the other sectors lll7in . 

Before we move on to compare the MST representation against the PMFG 
representation of cross correlations between the ten DJUS economic sectors, let 
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us see what kind of macroeconomic significance we can attach to these corre- 
sponding segments. Clearly, until detailed studies of individual episodes within 
the present financial crisis are completed and their full reports made available, we 
have to rely on news reports for our macroeconomic interpretation. Searching for 
and annotating news reports for this purpose is a very challenging task, and thus 
we will only off'er interpretations for the B, Y2, Rl, R2, R3, and R4 corresponding 
segments, where highly plausible news can be identified. As it turns out, both the 
B and Y2 corresponding segments were triggered by Federal Reserve rate cuts [Q], 
to 4.75% on September 18, 2007, and to 4.50% on October 31, 2007 respectively. 
In these two corresponding segments (Fig.[9tc) and Fig.|9td)), the volatility shock 
is of a benevolent nature, and propagated very quickly into the IN-NC core of the 
MSTs. In contrast, the Rl corresponding segment was triggered by a combination 
of the December 2007 employment si tuati on report released on January 4, 2008 
by the US Bureau of Labor Statistics lll72ll . and an Institute for Supp ly Manage- 
ment report on the US service sector released in December 2007 [| 17311 . As we can 
see in Fig. |9te), the volatility shock associated with the negative news arrived at 
the IN-NC core later, suggesting that the more open chain-like MST does indeed 
insulate its core from malign forces in the market. 

Of the remaining three corresponding segments whose news trigger we man- 
aged to identify, R2 and R3 are both related to the Lehman Brothers fiasco. After 
the US Bureau of Labo r Sta tistics released their 'worst employment report in five 
years' on June 6, 2008 lll74ll . Lehman Brothers announced on June 9, 2008 that it 
was expecting a 2.8 billion USD loss for 2008, and planned to raise 6 billion USD 
through sale of stock and convertible preferred stock [i l75i1 . These events triggered 
a very short-lived R2 corresponding segment (Fig.|9tg)), which is unusual for how 
rapidly the IN core responded to the volatility shock. After R2, the market calmed 
back down, but remained nervously in a high-volatility phase, until R3 started in 
mid-August 2008, as the demise of Lehman Brothers unfolded. On August 22, 
2008, Lehman Brothers share prices soared when re ports emerged that the Korea 
Development Bank was considering buying the bank lll76ll . When this acquisition 
fell through, Lehr nan B rothers announced on August 28, 2008 its plans to lay oflf 
1,500 employees lll77ll . With no hope of government assistance on the horizon, 
US Treasury Secretary Timoth Geithner arranged for last-minute talks over the 
weekend of September 13 and 14, 2008, to have either the Bank of America or 
Barclays buy over the entire Lehman Brothers. When this last ditch eff"ort failed, 
Lehman Brothers file for Chapter 1 1 bankruptc y pro tection on Monday, Septem- 
ber 15, 2008, citing debts of 613 billion USD il78ll . Lehman Brothers was then 
broken up. Its brokerage part was sold to Barclays on September 20, 2008 [|l79ll . 
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The rest of the company, which includes franchise in the Asia Pacific region, Eu- 
rope, and the Middle East, were acquired by Nomura Holdings over a period 
spanning September 22, 2008 to October 13, 2008 [|l80ll . The R3 corresponding 
segment, which represents the tensest period in the present global financial crisis, 
ended in mid-November 2008, before the prolonged debacle ended with Lehman 
Brothers' investment management business, i nclud ing Neuberger Berman, being 
sold on December 3, 2008 to its management lll8in . Out of the 11 corresponding 
segments studied, the R3 MST (Fig. Oi)) was the most chain-like. As expected, 
FN was the first to be hit by the saga. But while CY and BM were hit right af- 
ter FN, the IN core of the MST was the second last to succumb to the Lehman 
Brothers volatility shock. This is another strong testimony to the correlational 
insulation effect afforded by the chain-like topology in the MST. 

Finally, we identify the R4 corresponding segment, which started in late De- 
cember 2008 and ended in early May 2009, with the combination of crisis faced 
by the US automobile industry, worries about liquidity levels in US banks, and 
how the Federal Reserve was handling the financial crisis . In mid-November, 
Chrysler, Ford, and General Motors testified before the US Senate that they were 
in urgent need of government assistance lll82n . This was followed by partisan pol- 
itics delaying the rescue efforts. While this crisis was playing out on the American 
consciousness, investors were probably also getting worried about the imminent 
Janua ry 30 , 2009 expiry of temporary exceptions to sections 23A and 23B limita- 
tions lll83n . These temporary exceptions allowed existing funds to be more easily 
shared within financial groups, and were part of the many measures announced 
by the Federal Reserve to ensure more efficient use of liquidity by the banks. The 
Federal Reserve extended temporary excepti ons to section 23A, and eventually 
allowed these to expire on October 30, 2009 lll84ll . But the news that most likely 
triggered the R4 segment amidst this backdrop of negative news, was the Decem- 
ber 30, 2008 press relea se th at the Federal Reserve will purchase toxic assets and 
make emergency loans 11 18511. a nd the market's realization that this will be to the 
tune of 1.2 trillion USD lll86ll . The R4 corresponding segment ended in early 
May 2009, as C hrysle r filed for its inevitable Chapter 1 1 bankrupt cy pr otection 
on May 1, 2009 iflO . foUwed by General Motors one month later iflSSll . A US 
banks stress test was also carried out by the Federal Reserve for 19 o f the largest 
US banks. This started in April 2009 and ended in early May 2009 l[l89ll . but is 
probably not strongly related to the end of R4, because this corresponding seg- 
ment ended in FN before the stress test started. Instead, the stress test showed up 
as a little blip in the temporal distribution of FN (see Fig. [8]). Looking at Fig. |9l 
we see also that the R4 MST was already rather close to being star-like, suggest- 
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ing that in spite of the turmoil within R4, which included the March 2009 stock 
markets low, the US economy was silently recovering from the crisis. 



4.3. Comparison between MST and PMFG 

The planar maximally filtered graph (PMFG) was introduced by Tumminello 
et al. to extract a representative subgr aph of the cross-correlation matrix contain- 
ing more information than th e MS T [|l90n . Since then, the method has been ap- 
plied for sector identification lll9in . to develop hierarchically nested facto r models 
II192I1 . to understan d the time horizon dependence of equity returns lll59ll . in port- 
folio optimization lll93n . and to understand the network structure of cross correla- 
tions among the world market indices lll66ll . More recently, Pozzi et al. computed 
the MSTs and PMFGs for the daily returns of 300 of the most-capitalized stocks 
on the NYSE for different window sizes between 2001 and 2003, and found that 
the center is always populat ed by stoc ks from the financial sector, whereas other 
sectors share the peripheral lll94lll95ll . This conclusion is different from what we 
arrived at based on the DJUS economic sector indices between 2002 and 2003 
(near the end of the previous financial crisis), where IN remains central, and FN 
sits on the periphery of the chain-like MST (Fig. 13b)). 

Because our main interest in this study is the present financial crisis, we did not 
construct the sectorial PMFG for the 2002-2003 period. Instead, we constructed 
the PMFGs for the three corresponding segments (Yl, Gl, B) at the start of the 
Subprime Crisis. To check that the PMFG is also a robust caricature of the cross 
correlations between the ten DJUS economic sectors, we constructed the Yl PM- 
FGs for the intervals (a), (b), and (c) identified in Section [Z6l These are shown 
in Fig. [lOl As we can see, the only differences between the PMFGs for inter- 
vals (a) and (b) are the positions and linkages of HC and TL. On the other hand, 
the PMFG for interval (c), which incorporates more than one segment for most 
economic sectors, is quite different from the PMFG for interval (a). In all three 
PMFGs, we see IN and CY play the roles of primary and secondary centers. This 
ability to reveal secondary centers in the sectorial dynamics of the US economy is 
the main advantage of the PMFG visualization has over the MST visualization. 

For example, if we adopt the very simplistic criterion of having five or more 
links to be a center in the PMFG, we see from Fig.[II]that BM (5), CY (7), IN (7), 
NC (5), and TC (5) are all PMFG centers within the corresponding segment Yl, 
CY (9), IN (6), NC (6) are the PMFG centers within the corresponding segment 
Gl, while BM (6), CY (5), IN (8), NC (5), HC (7) are the PMFG centers within 
the corresponding segment B. In particular, the PMFG structure of corresponding 
segment B suggests that IN and HC are the two epicenters of trading activities 
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(a) (b) (c) 

Figure 10: Planar maximally filtered graphs (PMFGs) of the ten DJUS economic sectors, con- 
structed using half-hourly time series from (a) July 25, 2007 to August 14, 2007; (b) July 27, 2007 
to August 10, 2007; and (c) July 23, 2007 to August 16, 2007. In this figure, solid links are strong 
Unks making up the MSTs, while dashed links are weaker links neglected in the MSTs. 



in October 2007. Since IN is most strongly linked to growth sectors (BM, CY, 
EN, FN, NC, TC) in the US economy, while HC is most strongly linked to quality 
sectors (TL, UT), we believe we are seeing the signatures of a 'flight to quality' 
in the early stages of the Subprime Crisis. Unlike the 'flights to quality' studied 
by economists (see for example, the recent works by Phillips and Yu, who tracked 
the massive flow of funds from US technology stocks to the US property market 
to commodities to the bond market, each time generating a bubble that crashed 
when the funds leave [ 196 . [l97|] ). the phenomenon we are seeing is within the 
same asset class. 



5. MST rearrangements 

Up till this point, we understood from our combined segmentation/clustering 
and cross-correlational analyses that the MST of the ten DJUS economic sectors 
presents a star-like topology during economy growth, and a chain-like topology 
during financial crisis (see Fig.|7]). These two limiting MSTs, along with those of 
intermediate topologies, can also be seen at the mesoscopic scale of corresponding 
segments within the present financial crisis (see Fig. |9l). For each corresponding 
segment, we then looked at the sectorial distribution of strong cross correlations, 
and the temporal order in which sectors made the transition, to find that strong 
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Figure 11: Planar maximally filtered graphs (PMRGs) of the corresponding segments Yl, Gl, and 
B identified in Fig. [8] In this figure, solid links are strong links making up the MSTs, while dashed 
Unks are weaker links neglected in the MSTs. 



cross correlations are frequently found at the fringe of the MST, where the volatil- 
ity shocks always originate. In this section, we address the most natural question 
that follows: what are the natures of the correlational changes, visualized as MST 
rearrangements, that accompany these transitions? 

5. 1 . Minimal MST rearrangements 

If we treat the MST like a molecule, the MST rearrangements that occur from 
one corresponding segment to the next can be described using the chemical lan- 
guage of bond breaking and bond formation. This analogy is useful, because it 
allows us to focus on identifying the minimal set of primitive rearrangements that 
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Figure 12: The MST of the corresponding segment Gl can be obtained from the MST of the 
corresponding segment Y 1 preceding it, by breaking the TL-IN and UT-CY bonds, and forming 
new bonds between TL-NC and UT-NC. 



occur in the MST, an example of which is shown in Fig. [121 Between the suc- 
cessive corresponding segments Yl and Gl identified in Fig. [81 we first note that 
the EN-BM-IN(-TC)-CY(-FN)-NC-HC backbone remains unchanged. We then 
note that TL, which is bonded to IN in Yl, and UT, which is bonded to CY in 
Yl, are both bonded to NC in Gl. This tells us that the minimal set of primitive 
rearrangements necessary to get from the Yl MST to the Gl MST consists of the 
breaking of the TL-IN and UT-CY bonds, and the formation of the TL-NC and 
UT-NC bonds. We also see from Fig. [I2l that, as expected, all MST cross corre- 
lations decreased going from Yl to Gl. In fact, all cross correlations decreased 
going from Yl to Gl. Therefore, to have the above rearrangements, we need 
C(TL, NC) to weaken slower than C(TL, IN), or have C(TL, IN) weaken faster 
than C(TL, NC). Similarly, we need C(UT, NC) to weaken slower than C(UT, IN), 
or have C(UT, CY) weaken faster than C(UT, NC). In any case, we need at least 
one cross correlation within the (TL-IN, TL-NC) and (UT-CY, UT-NC) pairs of 
cross correlations to be anomalous, for the rearrangement to occur. 

With this 'chemical' understanding of minimal MST rearrangements, we now 
proceed to investigate the cross-correlational changes going from corresponding 
segments Gl to B to Y2, as shown in Fig. [131 Accompanying the Yl to Gl tran- 
sition, we saw a chain-like MST rearranging into another chain-like MST. For the 
Gl to B to Y2 transitions, we see the more interesting MST rearrangements from 
chain-like to star-like, and then to a topology intermediate between a chain and a 
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Figure 13: The primitive MST rearrangements going from Gl to B to Y2. The MST went from 
chain-Hke in Gl to star-like in B, to an intermediate topology in Y2. 



Star. As expected, more primitive rearrangements are needed to bring about the 
chain-to- star transition going from the moderate- volatility Gl to the low- volatility 
B. Ignoring the change in sector directly bonded to IN within the CY-FN pair, we 
see that three bonds have to be broken and reformed. These three bonds are sig- 
nificant, because NC is nearly a star center in the Gl MST, but loses the status 
after the three bonds are broken. Of course, the bonds reformed around IN, mak- 
ing it the star center of the B MST. A similar interaction between NC and IN 
occurs again for the B to Y2 transition, where NC becomes central again, with 
the breaking of the UT-IN and HC-UT bonds to reform around NC. Since these 
corresponding segments are right after the start of the Subprime Crisis, it is no 
wonder that NC features so prominently in the MST rearrangements. 

Quantitatively, we expect cross correlations to fall generically between all sec- 
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tors, when the US economy progressed from the moderate-volatility Gl segment 
to the low-volatility B segment. This can be seen easily from the thick bonds in 
the Gl MST compared to the thin bonds in the B MST in Fig. [HI In fact, the drop 
in average cross correlations of CY is anomalously large, from (C)(CY) = 0.80 
in Gl, to (C)(CY) = 0.45 in B. In addition, when all other cross correlations were 
falling, that between UT and HC increased slightly. This bucking of the trend 
makes the correlational changes between UT and HC highly significant statisti- 
cally. Subsequently, when the US economy progressed from the low-volatility B 
segment to the high-volatility Y2 segment, the cross correlation between UT and 
HC decreased, when the cross correlations between all other sectors increased. 

5.2. Early detection of economic recovery 

Speaking of 'green shoots' of economic revival that were evident at the time. 
Federal Reserve chairman Ben Bernanke predicted that "America's worst reces- 
sion in decades will likely end in 2009 before a recovery gathers steam in 2010" 
ill] . After learning that the MST of the ten D JUS economic sectors is star-like and 
chain-like during the low-volatility economic growth phase and high-volatility 
economic crisis phase respectively, we look out for a star-like MST in the time 
series data of 2009 and 2010. Star-like MSTs can also be found deep inside an 
economic crisis phase. However, within the crisis phase, these star-like MSTs 
very quickly unravel to become chain-like MSTs. On the other hand, the star- 
shape topology is extremely robust and stable within the growth phase. Therefore, 
a persistent star-like MST, if it can be found, may be interpreted as the statistical 
signature that the US economy is firmly on track to full recovery (which may take 
up to two years across all sectors). 

More importantly, the number of primitive rearrangements needed to trans- 
form the MST of a given period into a star-like MST indicates how close we are 
to the actual recovery. We can use this feature of the prerecovery MST for the 
early detection of economic recovery. This should be possible whether the star- 
like MST is a cause, in the sense that such a correlational structure within the US 
economy promotes growth, or an effect, in the sense that economic growth natu- 
rally results in this MST topology. Indeed, when we inspect the MST structure of 
the moderate- volatility G2 segment in Sep 2009, we find that it is already star-like, 
with IN as the star center. From Fig. [14l we see that it is two to three primitive 
rearrangements away from the growth MST of 2004-2005. Therefore, based on 
the time series data up till 25 Nov 2009, the statistical evidence summarized in 
the MST suggests that the US economy was already in the pre-recovery stage, 
and Bernanke might be prophetic to call for an actual economic recovery in 2010. 
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Figure 14: Comparison of the MSTs for (a) the 2004-2005 growth period, and (b) the moderate- 
volatiHty segment around September 2009. 



5.3. Comparison between previous and present recoveries 

Since the time series data we have covers the recovery periods for both fi- 
nancial crises, we wanted also to compare the sequence of pre-recovery MSTs 
for the previous crisis against the ones we see for the present crisis. We im- 
mediately encountered two problems. First, volatility movements in the various 
sectors between 2002 and 2004 are much less coordinated than those we find 
in the present financial crisis, and thus it is difficult to find corresponding seg- 
ments. Second, for successive corresponding segments that we can find for the 
2002-2004, successive MSTs are structurally very different from each other, sug- 
gesting very violent rearrangements within the MSTs. In Fig. [151 we show the 
MSTs of four successive corresponding segments identified before and after the 
11 Sep 2001 attack, from August 2001 to December 2001. These corresponding 
segments, an August 2001 moderate-volatility segment before the 11 Sep 2001 
attack, a two-week extremely-high-volatility segment right after the attack, a Oc- 
tober 2001 high- volatility segment following this, and a November-December 
2001 moderate-volatility segment afterwards, are amongst the most well-defined 
ones that we can identified through the previous financial crisis. As we can see, 
it is impossible to assign a small number of bonds that must be broken and re- 
formed to go from one MST to the next. Throughout the violent rearrangements, 
IN remained at the center of the MSTs. We also see that the MST is chain-like 
before and after the 11 Sep 2001 attack, became briefly star-like in the October 
2001 high- volatility segment, before unravelling again to a chain-like MST for the 
rest of the year. 
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(a) (b) 




(c) (d) 

Figure 15: MSTs for successive corresponding segments: (a) moderate-volatility segment before 
the 11 Sep 2001 attack on the World Trade Center; (b) extremely-high- volatility segment right 
after the 11 Sep 2001 attack; (c) high-volatility segment following; and (d) moderate-volatility 
segment following. 



6. Conclusions 

To summarize, we performed a cross-section analysis on the high-frequency 
time series of the ten DJUS economic sector indices between February 2000 and 
November 2009, to discover statistical signatures that can be used to forecast eco- 
nomic recovery. The half-hourly time series of these indices are first segmented 
individually using a recursive entropic segmentation scheme. The segments of 
each economic sector are then hierarchically clustered into between four and 
seven clusters, representing the growth, crisis, correction, and crash macroeco- 
nomic phases. In our previous study [0], we compared the temporal distributions 
of clustered segments across all ten economic sectors, to see that the US economy 
emerged from the previous technology bubble financial crisis starting mid-2003, 
enjoyed a four-year period of growth, and then succumbed to the present property 
bubble financial crisis starting mid-2007. From this cross-section of temporal dis- 
tributions of clustered segments, we also see the US economy taking one and a 
half years to completely recover from the previous financial crisis, but only two 
months to completely enter the present financial crisis. More interestingly, for the 
present financial crisis, we find the volatility dynamics within the US economic 
sectors to be strongly driven by interest rate cuts by the Federal Reserve. Of the 
seven interest rate cuts made over 2007 and 2008, the first two lowered market 
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volatilities, the next two raised market volatilities, while the last three had no per- 
manent effect on market volatilities. 

In this paper, we extended the cross-section analysis, by constructing the 
cross -correlation matrices and therefrom the MSTs of the ten DJUS economic 
sectors first over February 2000 to August 2008, and the two-year intervals 2002- 
2003, 2004-2005, 2008-2009, as well as the 1 1 corresponding segments iden- 
tified in Fig. [HI In general, we find stronger cross correlatons when the market 
volatility is high, and weaker cross correlations when the market volatility is low. 
We also find evidence that cross correlations within the US economy have been 
increasing over the years. In all MSTs, we find a core-fringe structure, with CY, 
IN, and NC forming the core, and HC, TL, UT residing on the fringe. In spite 
of the supposed market turmoil we expect throughout the present financial cri- 
sis, a highly conserved EN-BM-IN-CY-NC-TC-HC backbone can be identified in 
most of the MSTs. Through an enhanced visualization scheme for the MSTs, we 
see a dynamic core of strongly-correlated sectors, which expands and contracts in 
tandem with changes in the overall market volatility. In addition, for all 1 1 cor- 
responding segments studied, we find the volatility shocks starting always at the 
fringe, frequently accompanied by anomalously high cross correlations here, and 
propagating inwards towards the core of the MST. These volatility shocks orig- 
inate mostly from the US domestic fringe sectors, which are weakly coupled to 
the world market, instead of coming from EN and BM, which are most strongly 
coupled to the global supply and demand cycles. 

More importantly, we see that the MSTs of the ten DJUS economic sectors 
can be classified into two distinct topologies: star-like and chain-like. The MST 
is robustly star-like during economic growth, with IN at the center, and robustly 
chain-like within an economic crisis. For the present financial crisis, the MST of 
a corresponding segment can be obtained from the MST of the preceding corre- 
sponding segment through a small set of primitive rearrangements. In contrast, 
very violent rearrangements are implied going from the MST of one correspond- 
ing segment to the MST of the next corresponding segment within the previous, 
mid- 1998 to mid-2003, financial crisis. This suggests that the US economy has 
become more efficient in processing information arising from volatility shocks. 
Combining these two observations, we postulated that the star-like MST seen in 
the Sep 2009 G2 corresponding segment indicates that the US economy was in 
the early stages of economic recovery. 

After this study was completed, US market volatilities remained moderate to 
high until the start of May 2010, when investor confidence was again tested, first 
by the glitch in the NYSE electronic trading platform, and then by the unfolding 
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Greek Debt Crisis. Market volatilities skyrocketed, and even after the European 
Union announced their bailout plan for Greece, the atmosphere of economic un- 
certainty lingered. When interviewed in July 2010 on NBC's "Meet the Press" 
programme, US Treasury Secretary Timothy Geithner acknowledged the sl ow re - 
covery of the US economy, but added that it is gradually gaining strength [|l98n . 
A commentary that appears the same day Geithner' s interview was aired com- 
plicates the mood, by citing economists who warn that rece nt gains in the stock 
market need not be an indicator of economic recovery lll99ll . In fact, on 24 Aug 
2010, world stock markets fell over concerns that the yen is too strong for the 
good of the Japanese economy, and also over more bad news anticipated from the 
US economic reports due to be released the same week ll200ll . 
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Figure 16: MSTs for four successive corresponding segments straddling the Greek debt crisis: 
Grl (extremely high volatility, 21 Jan to 15 Feb 2010), Gr2 (moderate to high volatility, 1-3 1 Mar 
2010), and Gr3 (extremely high volatility, 1 May to 15 Jul 2010). 



To check if the US economic recovery might have been derailed by the Greek 
Debt Crisis, we segmented the DJUS economic sector time series from January to 
July 2010, and constructed MSTs for three corresponding segments. The MSTs 
for these three corresponding segments, the extremely-high-volatility Grl seg- 
ment (21 Jan- 15 Feb 2010) and moderate- volatility Gr2 segment (1-31 Mar 2010) 
before the Greek Debt Crisis, and the extremely-high-volatility Greek Debt Crisis 
Gr3 segment itself (1 May-15 Jul 2010), are shown in Fig. [161 As we can see, even 
though market volatilities are high, the MST presented a very robust star shape in 
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all three corresponding segments. While they are nervous, it appears that investor 
sentiments during the Greek Debt Crisis are distinctly different from those seen 
over 2008 and the first half of 2009. Judging from the increasingly star-like MST, 
it appears that the US economy is staying its course to the long-awaited economic 
recovery. 
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Appendix A. Top components of Dow Jones US economic sector indices 



Appendix A.l. Basic Materials 



ISIN/Ticker 


Company 


Adjusted Weight 


FCX 


Freeport-McMoRan Copper & Gold Inc. 


10.29% 


DD 


E.I. DuPont de Nemours & Co. 


9.12% 


DOW 


Dow Chemical Co. 


7.70% 


NEM 


Newmont Mining Corp. 


6.14% 


PX 


Praxair Inc. 


6.13% 


APD 


Air Products & Chemicals Inc. 


3.67% 


BTU 


Peabody Energy Corp. 


3.38% 


AA 


Alcoa Inc. 


2.89% 


PPG 


PPG Industries Inc. 


2.78% 


ECL 


Ecolab Inc. 


2.41% 
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Appendix A.2. Consumer Services 



ISIN/Ticker 


Company 


Adjusted Weight 


WMT 


Wal-Mart Stores Inc. 


7.28% 


MCD 


McDonald's Corp. 


5.33% 


DIS 


Walt Disney Co. 


4.15% 


AMZN 


Amazon.com Inc. 


3.90% 


HD 


Home Depot Inc. 


3.28% 


CVS 


CVS Caremark Corp. 


2.70% 


CMCSA 


Comcast Corp. CI A 


2.64% 


TGT 


Target Corp. 


2.44% 


DTV 


DIRECTV Group Inc. 


2.30% 


WAG 


Walgreen Co. 


2.18% 



Appendix A. 3. Oil &■ Gas 



ISIN/Ticker 


Company 


Adjusted Weight 


XOM 


Exxon Mobil Corp. 


25.57% 


cvx 


Chevron Corp. 


11.68% 


SLB 


Schlumberger Ltd. 


7.59% 


COP 


ConocoPhillips 


6.01% 


OXY 


Occidental Petroleum Corp. 


5.14% 


APA 


Apache Corp. 


2.96% 


HAL 


Halliburton Co. 


2.47% 


APC 


Anadarko Petroleum Corp. 


2.28% 


DVN 


Devon Energy Corp. 


2.08% 


NOV 


National Oil well Varco Inc. 


1.84% 



Appendix A.4. Financials 



ISIN/Ticker 


Company 


Adjusted Weight 


JPM 


JPMorgan Chase & Co. 


7.50% 


WFC 


Wells Fargo & Co. 


6.73% 


BAC 


Bank of America Corp. 


5.50% 


C 


Citigroup Inc. 


5.04% 


GS 


Goldman Sachs Group Inc. 


3.40% 


BRK/B 


Berkshire Hathaway Inc. CI B 


3.38% 


AXP 


American Express Co. 


2.32% 


USB 


U.S. Bancorp 


2.30% 


V 


VISA Inc. CI A 


1.85% 


BK 


Bank of New York Mellon Corp. 


1.65% 
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Appendix A. 5. Healthcare 



ISIN/Ticker 


Company 


Adjusted Weight 


JNJ 


Johnson & Johnson 


12.56% 


PFE 


Pfizer Inc. 


9.68% 


MRK 


Merck & Co. Inc. 


7.82% 


ABT 


Abbott Laboratories 


5.28% 


AMGN 


Amgen Inc. 


3.72% 


BMY 


Bristol-Myers Squibb Co. 


3.20% 


UNH 


UnitedHealth Group Inc. 


3.03% 


MDT 


Medtronic Inc. 


2.68% 


LLY 


Eli Lilly & Co. 


2.44% 


GILD 


Gilead Sciences Inc. 


2.26% 



Appendix A.6. Industrials 



ISIN/Ticker 


Company 


Adjusted Weight 


GE 


General Electric Co. 


10.45% 


UTX 


United Technologies Corp. 


4.03% 


MMM 


3M Co. 


3.39% 


UPS 


United Parcel Service Inc. CI B 


3.10% 


CAT 


Caterpillar Inc. 


2.98% 


UNP 


Union Pacific Corp. 


2.77% 


BA 


Boeing Co. 


2.58% 


EMR 


Emerson Electric Co. 


2.57% 


HON 


Honeywell International Inc. 


2.15% 


DE 


Deere & Co. 


1.95% 



Appendix A. 7. Consumer Goods 



ISIN/Ticker 


Company 


Adjusted Weight 


PG 


Procter & Gamble Co. 


13.34% 


KO 


Coca-Cola Co. 


10.35% 


PM 


Philip Morris International Inc. 


8.02% 


PEP 


PepsiCo Inc. 


7.91% 


F 


Ford Motor Co. 


4.09% 


MO 


Altria Group Inc. 


3.85% 


KFT 


Kraft Foods Inc. CI A 


3.73% 


CL 


Colgate-Palmolive Co. 


2.90% 


MON 


Monsanto Co. 


2.50% 


NKE 


Nike Inc. CI B 


1.97% 
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Appendix A.8. Technology 


ISIN/Ticker 


Company 


Adjusted Weight 


AAPL 


Apple Inc. 


13.57% 


MSFT 


Microsoft Corp. 


9.33% 


IBM 


International Business Machines Corp. 


8.58% 


GOOG 


Google Inc. CI A 


6.52% 


INTC 


Intel Corp. 


5.65% 


CSCO 


Cisco Systems Inc. 


5.31% 


ORCL 


Oracle Corp. 


4.98% 


HPQ 


Hewlett-Packard Co. 


4.73% 


QCOM 


Qualcomm Inc. 


3.61% 


EMC 


EMC Corp. 


2.11% 


Appendix A.9. 


Telecommunications 




ISIN/Ticker 


Company 


Adjusted Weight 


T 


AT&T Inc. 


44.32% 


VZ 


Verizon Communications Inc. 


24.29% 


AMT 


American Tower Corp. CI A 


5.45% 


CTL 


Century Link Inc. 


3.47% 


S 


Sprint Nextel Corp. 


2.99% 


CCI 


Crown Castle International Corp. 


2.75% 


Q 


Qwest Communications International Inc. 


2.67% 


FTR 


Frontier Communications Corp. 


2.44% 


VMED 


Virgin Media Inc. 


2.03% 


NIHD 


Nil Holdings Inc. 


1.72% 


Appendix A. 10. 


Utilities 




ISIN/Ticker 


Company 


Adjusted Weight 


SO 


Southern Co. 


6.68% 


EXC 


Exelon Corp. 


5.61% 


D 


Dominion Resources Inc. (Virginia) 


5.28% 


DUK 


Duke Energy Corp. 


4.94% 


NEE 


NextEra Energy Inc. 


4.52% 


PCG 


PG&E Corp. 


3.96% 


AEP 


American Electric Power Co. Inc. 


3.67% 


PEG 


Public Service Enterprise Group Inc. 


3.38% 


SE 


Spectra Energy Corp. 


3.32% 


ED 


Consolidated Edison Inc. 


2.93% 
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